Minimum Expected Distortion in Gaussian Layered 
Broadcast Coding with Successive Refinement 



Chris T. K. Ng*, Deniz Gündüz†, Andrea J. Goldsmith*, and Elza Erkip†

*Dept. of Electrical Engineering, Stanford University, Stanford, CA 94305 USA 
†Dept. of Electrical and Computer Engineering, Polytechnic University, Brooklyn, NY 11201 USA
Email: *{ngctk,andrea}@wsl.stanford.edu, †dgundu01@utopia.poly.edu, elza@poly.edu



Abstract — A transmitter without channel state information 
(CSI) wishes to send a delay-limited Gaussian source over a 
slowly fading channel. The source is coded in superimposed 
layers, with each layer successively refining the description in the 
previous one. The receiver decodes the layers that are supported 
by the channel realization and reconstructs the source up to 
a distortion. In the limit of a continuum of infinite layers, the 
optimal power distribution that minimizes the expected distortion 
is given by the solution to a set of linear differential equations 
in terms of the density of the fading distribution. In the optimal 
power distribution, as SNR increases, the allocation over the 
higher layers remains unchanged; rather the extra power is 
allocated towards the lower layers. On the other hand, as the 
bandwidth ratio b (channel uses per source symbol) tends to 
zero, the power distribution that minimizes expected distortion 
converges to the power distribution that maximizes expected 
capacity. While expected distortion can be improved by acquiring 
CSI at the transmitter (CSIT) or by increasing diversity from 
the realization of independent fading paths, at high SNR the 
performance benefit from diversity exceeds that from CSIT, 
especially when b is large. 

I. Introduction 

We consider the transmission of a delay-limited Gaussian 
source over a slowly fading channel in the absence of channel 
state information (CSI) at the transmitter. As the channel 
is non-ergodic, source-channel separation is not necessarily 
optimal. We consider the layered broadcast coding scheme 
in which each superimposed source layer successively refines 
the description in the previous one. The receiver decodes 
the layers that are supported by the channel realization and 
reconstructs the source up to a distortion. We are interested in 
minimizing the expected distortion of the reconstructed source 
by optimally allocating the transmit power among the layers 
of codewords. 

The broadcast strategy is proposed in [1] to characterize 
the set of achievable rates when the channel state is unknown 
at the transmitter. In the case of a Gaussian channel under 
Rayleigh fading, [2], [3] describe the layered broadcast coding 
approach and derive the optimal power allocation that maximizes
the expected capacity. In the transmission of a Gaussian
source over a Gaussian channel, uncoded transmission is optimal
[4] in the special case when the source bandwidth equals
the channel bandwidth [5]. For other bandwidth ratios, hybrid
digital-analog joint source-channel transmission schemes are 
studied in [6]-[8], where the codes are designed to be optimal 
at a target SNR but degrade gracefully should the realized 
SNR deviate from the target. 

The distortion exponent, defined as the exponential decay 
rate of the expected distortion in the high SNR regime, 
is investigated in [9] in the transmission of a source over 
two independently fading channels. For quasi-static multiple- 
antenna Rayleigh fading channels, distortion exponent upper 
bounds and achievable joint source-channel schemes are studied
in [10]-[12]. The expected distortion of the layered source
coding with progressive transmission (LS) scheme proposed 
in [11] is analyzed in [13] for a finite number of layers at 
finite SNR. Concatenation of broadcast channel coding with 
successive refinement [14], [15] source coding is shown in 
[10], [11] to be optimal in terms of the distortion exponent for
multiple input single output (MISO) and single input multiple 
output (SIMO) channels. Numerical optimization of the power 
allocation with constant rate among the layers is examined 
in [16], while [17] considers the optimization of power and 
rate allocation and presents approximate solutions in the high 
SNR regime. The optimal power allocation that minimizes the 
expected distortion at finite SNR in layered broadcast coding 
is derived in [18] when the channel has a finite number of 
discrete fading states. This work extends [18] and considers 
the minimum expected distortion for channels with continuous 
fading distributions. In a related work in [19], the optimal 
power distribution that minimizes the expected distortion is 
derived using the calculus of variations method. 

The remainder of the paper is organized as follows. Section II
presents the system model, and Section III describes the
layered broadcast coding scheme with successive refinement.
The optimal power distribution that minimizes the expected
distortion is derived in Section IV. Section V considers
Rayleigh fading channels with diversity, followed by conclusions
in Section VI.

II. System Model 

Consider the system model illustrated in Fig. 1. A transmitter
wishes to send a Gaussian source over a wireless channel to
a receiver, at which the source is to be reconstructed with a
distortion. Let the source be denoted by $s$, which is a sequence of
independent identically distributed (iid) zero-mean circularly
symmetric complex Gaussian (ZMCSCG) random variables
with unit variance: $s \in \mathbb{C} \sim \mathcal{CN}(0, 1)$. The transmitter and
the receiver each have a single antenna and the channel is
described by $y = Hx + n$, where $x \in \mathbb{C}$ is the transmit
signal, $y \in \mathbb{C}$ is the received signal, and $n \in \mathbb{C} \sim \mathcal{CN}(0, 1)$
is iid unit-variance ZMCSCG noise.

Fig. 1. Source-channel coding without CSI at the transmitter.

Suppose the distribution of the channel power gain is 
described by the probability density function (pdf) $f(\gamma)$, where
$\gamma = |h|^2$ and $h \in \mathbb{C}$ is a realization of $H$. The receiver has
perfect CSI but the transmitter has only channel distribution
information (CDI), i.e., the transmitter knows the pdf $f(\gamma)$ but
not its instantaneous realization. The channel is modeled by a 
quasi-static block fading process: H is realized iid at the onset 
of each fading block and remains unchanged over the block 
duration. We assume decoding at the receiver is delay-limited; 
namely, delay constraints preclude coding across fading blocks 
but dictate that the receiver decodes at the end of each block. 
Hence the channel is non-ergodic. 

Suppose each fading block spans N channel uses, over 
which the transmitter describes K of the source symbols. We 
define the bandwidth ratio as b = N/K, which relates the 
number of channel uses per source symbol. At the transmitter 
there is a power constraint on the transmit signal $E[|x|^2] \le P$,
where the expectation is taken over repeated channel uses over 
the duration of each fading block. We assume a short-term 
power constraint and do not consider power allocation across 
fading blocks. We assume K is large enough to consider the 
source as ergodic, and N is large enough to design codes that 
achieve the instantaneous channel capacity of a given fading 
state with negligible probability of error. 

At the receiver, the channel output $y$ is used to reconstruct
an estimate $\hat{s}$ of the source. The distortion $D$ is measured by
the mean squared error $E[(s - \hat{s})^2]$ of the estimator, where the
expectation is taken over the $K$-sequence of source symbols
and the noise distribution. The instantaneous distortion of the
reconstruction depends on the fading realization of the channel;
we are interested in minimizing the expected distortion
$E_H[D]$, where the expectation is over the fading distribution.

III. Layered Broadcast Coding with 
Successive Refinement 

We build upon the power allocation framework derived in
[18], and first assume the fading distribution has $M$ discrete
states: the channel power gain realization is $\gamma_i$ with probability
$p_i$, for $i = 1, \ldots, M$, as depicted in Fig. 2. Accordingly there
are $M$ virtual receivers and the transmitter sends the sum of $M$
layers of codewords. Let layer $i$ denote the layer of codeword
intended for virtual receiver $i$, and we order the layers as
$\gamma_M > \cdots > \gamma_1 > 0$. We refer to layer $M$ as the highest layer and
layer 1 as the lowest layer. Each layer successively refines the
description of the source $s$ from the layer below it, and the
codewords in different layers are independent. Let $P_i$ be the
transmit power allocated to layer $i$; then the transmit symbol
$x$ can be written as

$$x = \sqrt{P_1}\,x_1 + \sqrt{P_2}\,x_2 + \cdots + \sqrt{P_M}\,x_M, \tag{1}$$

where $x_1, \ldots, x_M$ are iid ZMCSCG random variables with
unit variance. Suppose the layers are evenly spaced, with
$\gamma_{i+1} - \gamma_i = \Delta\gamma$. In Section IV we consider the limiting process
as $\Delta\gamma \to 0$ to obtain the power distribution:

$$\rho(\gamma) \triangleq \lim_{\Delta\gamma \to 0} \frac{1}{\Delta\gamma}\, P_{\lceil \gamma/\Delta\gamma \rceil}, \tag{2}$$

where for discrete layers the power allocation $P_i$ is referenced
by the integer layer index $i$, while the continuous power
distribution $\rho(\gamma)$ is indexed by the channel power gain $\gamma$.

Fig. 2. Layered broadcast coding with successive refinement.

With successive decoding [20], each virtual receiver first
decodes and cancels the lower layers before decoding its own
layer; the undecodable higher layers are treated as noise. Thus
the rate $R_i$ intended for virtual receiver $i$ is

$$R_i = \log\left(1 + \frac{\gamma_i P_i}{1 + \gamma_i \sum_{j=i+1}^{M} P_j}\right), \tag{3}$$

where the term $\gamma_i \sum_{j=i+1}^{M} P_j$ represents the interference
power from the higher layers. Suppose $\gamma_k$ is the realized
channel power gain; then the original receiver can decode
layer $k$ and all the layers below it. Hence the realized rate
$R_{\mathrm{rlz}}(k)$ at the original receiver is $R_1 + \cdots + R_k$.

From the rate distortion function of a complex Gaussian
source [20], the mean squared distortion is $2^{-bR}$ when the
source is described at a rate of $bR$ per symbol. Thus the
realized distortion $D_{\mathrm{rlz}}(k)$ of the reconstructed source $\hat{s}$ is

$$D_{\mathrm{rlz}}(k) = 2^{-bR_{\mathrm{rlz}}(k)} = 2^{-b(R_1 + \cdots + R_k)}, \tag{4}$$

where the last equality follows from successive refinability
[14], [15]. The expected distortion $E_H[D]$ is obtained by
averaging over the fading distribution:

$$E_H[D] = \sum_{k=1}^{M} p_k D_{\mathrm{rlz}}(k) = \sum_{k=1}^{M} p_k \prod_{i=1}^{k} \left(\frac{1 + \gamma_i T_{i+1}}{1 + \gamma_i T_i}\right)^{b}, \tag{5}$$

where $T_i$ represents the cumulative power in layers $i$ and
above: $T_i \triangleq \sum_{j=i}^{M} P_j$, for $i = 1, \ldots, M$; $T_{M+1} \triangleq 0$. In the
next section we derive the optimal cumulative power allocation
$T_2^*, \ldots, T_M^*$ to find the minimum expected distortion $E_H[D]^*$.

Fig. 3. Power allocation between two adjacent layers.
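The discrete-state model above can be evaluated directly. The following is a minimal sketch, assuming the logarithm in (3) is base 2 (so that the distortion $2^{-bR}$ in (4) is consistent) and using a hypothetical two-state channel whose gains, probabilities and power splits are chosen only for illustration; the function names are ours.

```python
import math

def layer_rates(gains, powers):
    """Rate of each layer from (3), in bits per channel use (log base 2).

    gains  -- channel power gains of the M states, in increasing order
    powers -- transmit power P_i allocated to each layer
    """
    rates = []
    for i, (gain, power) in enumerate(zip(gains, powers)):
        interference = gain * sum(powers[i + 1:])  # undecodable higher layers
        rates.append(math.log2(1.0 + gain * power / (1.0 + interference)))
    return rates

def expected_distortion(gains, probs, powers, b):
    """Average the realized distortion (4) over the M fading states."""
    total, decoded_rate = 0.0, 0.0
    for prob, rate in zip(probs, layer_rates(gains, powers)):
        decoded_rate += rate                       # R_1 + ... + R_k
        total += prob * 2.0 ** (-b * decoded_rate)
    return total

# Hypothetical two-state channel; the numbers are illustrative only.
gains, probs, b = [0.5, 2.0], [0.4, 0.6], 1.0
single_layer = expected_distortion(gains, probs, [1.0, 0.0], b)
two_layers = expected_distortion(gains, probs, [0.6, 0.4], b)
print(f"all power in layer 1: {single_layer:.4f}")
print(f"power split 0.6/0.4:  {two_layers:.4f}")
```

In this toy setting, splitting the power between the two layers gives a lower expected distortion than placing all of it in the lowest layer, which is the effect the power allocation problem seeks to exploit.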

IV. Optimal Power Distribution 

To derive the minimum expected distortion, we factor the
sum of cumulative products in (5) and rewrite the expression
as a set of recurrence relations:

$$D_M \triangleq (1 + \gamma_M T_M)^{-b}\, p_M, \tag{6}$$

$$D_i \triangleq \left(\frac{1 + \gamma_i T_{i+1}}{1 + \gamma_i T_i}\right)^{b} (p_i + D_{i+1}), \tag{7}$$

where $i$ runs from $M - 1$ down to 1. The term $D_i$ can
be interpreted as the cumulative distortion from layers $i$ and
above, with $D_1$ equal to the expected distortion $E_H[D]$ and
its minimum $D_1^*$ equal to $E_H[D]^*$. Note that $D_i$ depends on
only two adjacent power allocation variables $T_i$ and $T_{i+1}$;
therefore, in each recurrence step $i$ in (7), we solve for the
optimal $T_{i+1}^*$ in terms of $T_i$.
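As a sanity check on the factoring step, the sketch below compares the sum of cumulative products with a backward recurrence. It assumes the recurrence takes the form $D_M = (1 + \gamma_M T_M)^{-b} p_M$ and $D_i = \left(\frac{1 + \gamma_i T_{i+1}}{1 + \gamma_i T_i}\right)^{b}(p_i + D_{i+1})$, which is our reading of the factorization; the function names and the three-state channel are hypothetical.

```python
def cumulative_powers(powers):
    """Return [T_1, ..., T_M, T_{M+1}] with T_i = P_i + ... + P_M and T_{M+1} = 0."""
    T = [0.0] * (len(powers) + 1)
    for i in range(len(powers) - 1, -1, -1):
        T[i] = T[i + 1] + powers[i]
    return T

def layer_factor(gain, T_here, T_above, b):
    """Per-layer distortion factor ((1 + gain T_{i+1}) / (1 + gain T_i))^b."""
    return ((1.0 + gain * T_above) / (1.0 + gain * T_here)) ** b

def distortion_by_sum(gains, probs, T, b):
    """Sum over states of p_k times the product of the factors of layers 1..k."""
    total, product = 0.0, 1.0
    for k, gain in enumerate(gains):
        product *= layer_factor(gain, T[k], T[k + 1], b)
        total += probs[k] * product
    return total

def distortion_by_recurrence(gains, probs, T, b):
    """Run D_i = factor_i * (p_i + D_{i+1}) from the top layer down; returns D_1."""
    D = 0.0  # plays the role of D_{M+1} = 0, so the first step gives D_M
    for i in range(len(gains) - 1, -1, -1):
        D = layer_factor(gains[i], T[i], T[i + 1], b) * (probs[i] + D)
    return D

# Hypothetical three-state channel; the numbers are illustrative only.
gains, probs, b = [0.5, 1.0, 2.0], [0.3, 0.3, 0.4], 2.0
T = cumulative_powers([0.5, 0.3, 0.2])
by_sum = distortion_by_sum(gains, probs, T, b)
by_recurrence = distortion_by_recurrence(gains, probs, T, b)
print(by_sum, by_recurrence)
```

The two evaluations agree up to rounding. The recurrence form is the one that matters for the optimization, since each step couples only two adjacent cumulative powers.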

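Specifically, consider the optimal power allocation between
layer $\gamma$ and its lower layer $\gamma - \Delta\gamma$ as shown in Fig. 3. Let
$T(\gamma - \Delta\gamma)$ denote the available transmit power for layers
$\gamma - \Delta\gamma$ and above, of which $T(\gamma)$ is allocated to layers $\gamma$ and
above; the remaining power $T(\gamma - \Delta\gamma) - T(\gamma)$ is allocated
to layer $\gamma - \Delta\gamma$. Under optimal power allocation, it is shown
in [18] that the cumulative distortion from layers $\gamma$ and above
can be written in the form:

$$D^*(\gamma) = (1 + \gamma T(\gamma))^{-b}\, W(\gamma), \tag{8}$$

where $W(\gamma)$ is interpreted as an equivalent probability weight
summarizing the aggregate effect of the layers $\gamma$ and above.
For the lower layer in Fig. 3, $f(\gamma)\Delta\gamma$ represents the probability
that layer $\gamma - \Delta\gamma$ is realized.

In the next recurrence step as prescribed by (7), the cumulative
distortion for the lower layer is

$$D(\gamma - \Delta\gamma) = \left(\frac{1 + (\gamma - \Delta\gamma)\, T(\gamma)}{1 + (\gamma - \Delta\gamma)\, T(\gamma - \Delta\gamma)}\right)^{b} \left[f(\gamma)\Delta\gamma + D^*(\gamma)\right] \tag{9}$$

$$= \left(\frac{1 + (\gamma - \Delta\gamma)\, T(\gamma)}{1 + (\gamma - \Delta\gamma)\, T(\gamma - \Delta\gamma)}\right)^{b} \left[f(\gamma)\Delta\gamma + (1 + \gamma T(\gamma))^{-b}\, W(\gamma)\right], \tag{10}$$

with the minimization

$$D^*(\gamma - \Delta\gamma) = \min_{0 \le T(\gamma) \le T(\gamma - \Delta\gamma)} D(\gamma - \Delta\gamma). \tag{11}$$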



We solve the minimization by forming the Lagrangian:

$$L(T(\gamma), \lambda_1, \lambda_2) = D(\gamma - \Delta\gamma) + \lambda_1 \left(T(\gamma) - T(\gamma - \Delta\gamma)\right) - \lambda_2 T(\gamma).$$

The Karush-Kuhn-Tucker (KKT) conditions stipulate that the
gradient of the Lagrangian vanishes at the optimal power
allocation $T^*(\gamma)$, which leads to the solution:

$$T^*(\gamma) = \left\{ \begin{array}{lll} U(\gamma) & \text{if } U(\gamma) \le T(\gamma - \Delta\gamma) & \text{(12a)} \\ T(\gamma - \Delta\gamma) & \text{else,} & \text{(12b)} \end{array} \right.$$

where

$$U(\gamma) \triangleq \left\{ \begin{array}{lll} 0 & \text{if } \gamma \ge W(\gamma)/f(\gamma) + \Delta\gamma & \text{(13a)} \\ \dfrac{1}{\gamma}\left(\left[\dfrac{W(\gamma)}{f(\gamma)(\gamma - \Delta\gamma)}\right]^{\frac{1}{1+b}} - 1\right) & \text{else.} & \text{(13b)} \end{array} \right.$$

We assume there is a region of $\gamma$ where the cumulative
power allocation is not constrained by the power available
from the lower layers, i.e., $U(\gamma) \le U(\gamma - \Delta\gamma)$ and $U(\gamma) \le P$.
In this region the optimal power allocation $T^*(\gamma)$ is given by
the unconstrained minimizer $U(\gamma)$ in (12a). In the solution
to $U(\gamma)$ we need to verify that $U(\gamma)$ is non-increasing in
this region, which corresponds to the power distribution $\rho^*(\gamma)$
being non-negative. With the substitution of the unconstrained
cumulative power allocation $U(\gamma)$ in (10), the cumulative
distortion at layer $\gamma - \Delta\gamma$ becomes:

$$D^*(\gamma - \Delta\gamma) = \left(\frac{1 + (\gamma - \Delta\gamma)\, T(\gamma - \Delta\gamma)}{1 + (\gamma - \Delta\gamma)\, U(\gamma)}\right)^{-b} \left[f(\gamma)\Delta\gamma + (1 + \gamma U(\gamma))^{-b}\, W(\gamma)\right], \tag{14}$$

which is of the form in (8) if we define $W(\gamma - \Delta\gamma)$ by the
recurrence equation:

$$W(\gamma - \Delta\gamma) = (1 + (\gamma - \Delta\gamma)\, U(\gamma))^{b} \left[f(\gamma)\Delta\gamma + (1 + \gamma U(\gamma))^{-b}\, W(\gamma)\right]. \tag{15}$$

Next we consider the limiting process as the spacing between
the layers condenses. In the limit of $\Delta\gamma$ approaching
zero, the recurrence equations (14), (15) become differential
equations. The optimal power distribution $\rho^*(\gamma)$ is given by
the derivative of the cumulative power allocation:

$$\rho^*(\gamma) = -T^{*\prime}(\gamma), \tag{16}$$

where $T^*(\gamma)$ is described by solutions in three regions:

$$T^*(\gamma) = \left\{ \begin{array}{lll} 0 & \gamma > \gamma_o & \text{(17a)} \\ U(\gamma) & \gamma_P \le \gamma \le \gamma_o & \text{(17b)} \\ P & \gamma < \gamma_P. & \text{(17c)} \end{array} \right.$$

In region (17a) when $\gamma > \gamma_o$, corresponding to cases (12a) and
(13a), no power is allocated to the layers and (15) simplifies
to $W(\gamma) = 1 - F(\gamma)$, where $F(\gamma) \triangleq \int_0^{\gamma} f(s)\,ds$ is the
cumulative distribution function (cdf) of the channel power
gain. The boundary $\gamma_o$ is defined by the condition in (13a)
which satisfies:

$$\gamma_o f(\gamma_o) + F(\gamma_o) - 1 = 0. \tag{18}$$

Under Rayleigh fading when $f(\gamma) = \bar{\gamma}^{-1} e^{-\gamma/\bar{\gamma}}$, where $\bar{\gamma}$ is
the expected channel power gain, (18) evaluates to $\gamma_o = \bar{\gamma}$. For
other fading distributions, $\gamma_o$ may be computed numerically.
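One way to compute the boundary numerically is bisection on the left-hand side of (18). The sketch below is illustrative only: it uses the Erlang density of Section V (with $L = 1$ reducing to Rayleigh fading), and the bracket relies on the fact that for this family $\gamma f(\gamma) + F(\gamma) - 1$ starts at $-1$, increases up to $\gamma = \bar{\gamma}(L+1)/L$, and then decays to zero from above. Other densities would need their own bracket; the function names are ours.

```python
import math

def erlang_pdf(gain, L, mean):
    """Erlang density of order L with the given mean (L = 1 is Rayleigh fading)."""
    rate = L / mean
    return rate ** L * gain ** (L - 1) * math.exp(-rate * gain) / math.factorial(L - 1)

def erlang_cdf(gain, L, mean):
    """Closed-form Erlang cdf: 1 - exp(-x) * sum_{k<L} x^k / k!, with x = L gain / mean."""
    x = L * gain / mean
    return 1.0 - math.exp(-x) * sum(x ** k / math.factorial(k) for k in range(L))

def upper_boundary(L, mean, tol=1e-12):
    """Bisection for the root of gain * f(gain) + F(gain) - 1 = 0."""
    def residual(gain):
        return gain * erlang_pdf(gain, L, mean) + erlang_cdf(gain, L, mean) - 1.0
    lo, hi = 0.0, mean * (L + 1) / L  # residual is -1 at lo and positive at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for L in (1, 2, 4):
    print(f"L = {L}: gamma_o = {upper_boundary(L, 1.0):.6f}")
```

For $L = 1$ the routine recovers $\gamma_o = \bar{\gamma}$, in agreement with the Rayleigh case stated above.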

In region (17b) when $\gamma_P \le \gamma \le \gamma_o$, corresponding to cases
(12a) and (13b), the optimal power distribution is described
by a set of differential equations. We apply the first order
binomial expansion $(1 + \Delta\gamma)^b \cong 1 + b\Delta\gamma$, and (15) becomes:

$$W'(\gamma) = \lim_{\Delta\gamma \to 0} \frac{W(\gamma) - W(\gamma - \Delta\gamma)}{\Delta\gamma} \tag{19}$$

$$= \frac{b\,U(\gamma)}{1 + \gamma U(\gamma)}\, W(\gamma) - (1 + \gamma U(\gamma))^{b} f(\gamma), \tag{20}$$

which we substitute in (13b) to obtain

$$U'(\gamma) = -\left(\frac{2/\gamma + f'(\gamma)/f(\gamma)}{1 + b}\right)\left(U(\gamma) + \frac{1}{\gamma}\right). \tag{21}$$

Hence $U(\gamma)$ is described by a first order linear differential
equation. With the initial condition $U(\gamma_o) = 0$, its solution is
given by

$$U(\gamma) = \frac{1}{1 + b} \int_{\gamma}^{\gamma_o} \frac{1}{s}\left(\frac{2}{s} + \frac{f'(s)}{f(s)}\right) \left[\frac{s^2 f(s)}{\gamma^2 f(\gamma)}\right]^{\frac{1}{1+b}} ds, \tag{22}$$

and condition (12b) in the lowest active layer becomes the
boundary condition $U(\gamma_P) = P$. In [19], the power distribution
in (22) is derived using the calculus of variations method.

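Similarly, as $\Delta\gamma \to 0$, the evolution of the expected
distortion in (14) becomes:

$$D'(\gamma) = -\frac{b\gamma U'(\gamma)}{1 + \gamma U(\gamma)}\, D(\gamma) - f(\gamma) \tag{23}$$

$$= \frac{b}{1 + b}\left(\frac{2}{\gamma} + \frac{f'(\gamma)}{f(\gamma)}\right) D(\gamma) - f(\gamma), \tag{24}$$

which is again a first order linear differential equation. With
the initial condition $D(\gamma_o) = W(\gamma_o) = \gamma_o f(\gamma_o)$, its solution
is given by

$$D(\gamma) = \left[\gamma^2 f(\gamma)\right]^{\frac{b}{1+b}} \left( \gamma_o f(\gamma_o) \left[\gamma_o^2 f(\gamma_o)\right]^{-\frac{b}{1+b}} + \int_{\gamma}^{\gamma_o} f(s) \left[s^2 f(s)\right]^{-\frac{b}{1+b}} ds \right). \tag{25}$$

Finally, in region (17c) when $\gamma < \gamma_P$, corresponding to
case (12b), the transmit power $P$ has been exhausted, and no
power is allocated to the remaining layers. Hence the minimum
expected distortion is

$$E_H[D]^* = D(0) = F(\gamma_P) + D(\gamma_P), \tag{26}$$

where the last equality follows from the fact that, when $\gamma < \gamma_P$
in region (17c), $\rho^*(\gamma) = 0$ and $D(\gamma) = \int_{\gamma}^{\gamma_P} f(s)\,ds + D(\gamma_P)$.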
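The three-region solution can be evaluated numerically. The following is a rough sketch for Rayleigh fading under two assumptions that reflect our reading of the derivation rather than a statement of the authors' implementation: (i) the unconstrained cumulative power is $U(\gamma) = \frac{1}{1+b}\int_{\gamma}^{\gamma_o} \frac{1}{s}\left(\frac{2}{s} + \frac{f'(s)}{f(s)}\right)\left[\frac{s^2 f(s)}{\gamma^2 f(\gamma)}\right]^{1/(1+b)} ds$ with $\gamma_o = \bar{\gamma}$, and (ii) in the active region the small-$\Delta\gamma$ limit of the unconstrained minimizer gives $W(\gamma) = \gamma f(\gamma)(1 + \gamma U(\gamma))^{1+b}$, so that (8) yields $D(\gamma_P) = \gamma_P f(\gamma_P)(1 + \gamma_P P)$. The function names, the Simpson quadrature and the bisection bracket are ours.

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0

def cumulative_power(gain, b, mean=1.0):
    """Unconstrained cumulative power U(gain) for Rayleigh fading (gamma_o = mean)."""
    if gain >= mean:
        return 0.0
    def pdf(s):
        return math.exp(-s / mean) / mean
    def integrand(s):
        weight = (s * s * pdf(s) / (gain * gain * pdf(gain))) ** (1.0 / (1.0 + b))
        return (2.0 / s - 1.0 / mean) / s * weight  # uses f'(s)/f(s) = -1/mean
    return simpson(integrand, gain, mean) / (1.0 + b)

def lowest_active_gain(P, b, mean=1.0, tol=1e-10):
    """Bisection for gamma_P with U(gamma_P) = P.

    Assumes U(lo) > P at the lower end of the bracket; U decreases to 0 at gamma_o.
    The quadrature loses accuracy when gamma_P is very small (very high SNR).
    """
    lo, hi = 1e-3 * mean, mean
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cumulative_power(mid, b, mean) > P:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def min_expected_distortion(P, b, mean=1.0):
    """Evaluate (26) with D(gamma_P) = gamma_P f(gamma_P) (1 + gamma_P P)."""
    gain = lowest_active_gain(P, b, mean)
    pdf = math.exp(-gain / mean) / mean
    cdf = 1.0 - math.exp(-gain / mean)
    return cdf + gain * pdf * (1.0 + gain * P)

gamma_P = lowest_active_gain(P=1.0, b=1.0)
print(f"gamma_P = {gamma_P:.4f}, E[D]* = {min_expected_distortion(1.0, 1.0):.4f}")
```

Setting `b=0.0` in `cumulative_power` gives a quick consistency check, since the integral then has the closed form $(1 - \gamma)/\gamma^2$ for unit-mean Rayleigh fading.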

V. Rayleigh Fading with Diversity 

In this section we consider the optimal power distribution
and the minimum expected distortion when the wireless channel
undergoes Rayleigh fading with a diversity order of $L$
from the realization of independent fading paths. Specifically,
we assume the fading channel is characterized by the Erlang
distribution:

$$f_L(\gamma) = \frac{(L/\bar{\gamma})^{L}\, \gamma^{L-1}\, e^{-L\gamma/\bar{\gamma}}}{(L-1)!}, \quad \gamma > 0, \tag{27}$$

which corresponds to the average of $L$ iid channel power
gains, each under Rayleigh fading with an expected value
of $\bar{\gamma}$. The $L$-diversity system may be realized by having
$L$ transmit antennas using isotropic inputs, by relaxing the
decode delay constraint over $L$ fading blocks, or by having
$L$ receive antennas under maximal-ratio combining when the
power gain of each antenna is normalized by $1/L$.

Fig. 4. Optimal power distribution (P = dB).
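The averaging interpretation can be checked numerically: the average of $L$ iid exponential gains with mean $\bar{\gamma}$ has mean $\bar{\gamma}$ and variance $\bar{\gamma}^2/L$, so the density should concentrate around $\bar{\gamma}$ as $L$ grows. The sketch below is a minimal illustration using a midpoint rule; the function names, truncation point and step count are our own choices.

```python
import math

def erlang_pdf(gain, L, mean):
    """Density (27): the average of L iid exponential gains with the given mean."""
    rate = L / mean
    return rate ** L * gain ** (L - 1) * math.exp(-rate * gain) / math.factorial(L - 1)

def moment(order, L, mean, upper=60.0, steps=60000):
    """Midpoint-rule estimate of the integral of gain^order * pdf over [0, upper * mean]."""
    width = upper * mean / steps
    total = 0.0
    for k in range(steps):
        gain = (k + 0.5) * width
        total += gain ** order * erlang_pdf(gain, L, mean)
    return total * width

for L in (1, 2, 4, 8):
    first = moment(1, L, 1.0)
    variance = moment(2, L, 1.0) - first ** 2
    print(f"L = {L}: mean = {first:.4f}, variance = {variance:.4f}")
```

The printed variance shrinks as $1/L$ while the mean stays fixed, which is the sense in which diversity hardens the channel.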

Fig. 4 shows the optimal power distribution $\rho^*(\gamma)$, which is
concentrated over a range of active layers. A higher SNR $P$ or
a larger bandwidth ratio $b$ extends the span of the active layers
further into the lower layers but the upper boundary $\gamma_o$ remains
unperturbed. It can be observed that a smaller bandwidth ratio
$b$ reduces the spread of the power distribution. In fact, as $b$
approaches zero, the optimal power distribution that minimizes
expected distortion converges to the power distribution that
maximizes expected capacity. To show the connection, we
take the limit in the distortion-minimizing cumulative power
distribution in (22):

$$\lim_{b \to 0} U(\gamma) = \frac{1 - F(\gamma) - \gamma f(\gamma)}{\gamma^2 f(\gamma)}, \tag{28}$$

which is equal to the capacity-maximizing cumulative power
distribution as derived in [3]. Essentially, from the first order
expansion $e^{b} \cong 1 + b$ for small $b$, $E_H[D] \cong 1 - b\,E_H[C]$ when
the bandwidth ratio is small, where $E_H[C]$ is the expected
capacity in nats/s, and hence minimizing expected distortion
becomes equivalent to maximizing expected capacity. For
comparison, the capacity-maximizing power distribution is
also plotted in Fig. 4. Note that the distortion-minimizing
power distribution is more conservative, and it is more so as
$b$ increases, as the allocation favors lower layers in contrast to
the capacity-maximizing power distribution.
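The small-$b$ approximation can be illustrated on a discrete example. The sketch below writes the realized distortion as $e^{-bC}$ with $C$ the decodable rate in nats (equivalent to $2^{-bR}$ with $R$ in bits) and compares $E_H[D]$ with $1 - b\,E_H[C]$ for decreasing $b$. The two-state channel and the power split are hypothetical values chosen only for illustration.

```python
import math

def realized_capacity_nats(gains, powers):
    """Cumulative decodable rate (in nats) for each realized state k."""
    total, capacities = 0.0, []
    for i, gain in enumerate(gains):
        interference = gain * sum(powers[i + 1:])  # undecodable higher layers
        total += math.log(1.0 + gain * powers[i] / (1.0 + interference))
        capacities.append(total)
    return capacities

# Hypothetical two-state channel; the numbers are illustrative only.
gains, probs, powers = [0.5, 2.0], [0.4, 0.6], [0.6, 0.4]
capacities = realized_capacity_nats(gains, powers)
expected_capacity = sum(p * c for p, c in zip(probs, capacities))

for b in (0.1, 0.01, 0.001):
    expected_distortion = sum(p * math.exp(-b * c) for p, c in zip(probs, capacities))
    approximation = 1.0 - b * expected_capacity
    print(f"b = {b}: E[D] = {expected_distortion:.8f}, 1 - b E[C] = {approximation:.8f}")
```

The gap between the two columns shrinks on the order of $b^2$, so for small $b$ any allocation that lowers $E_H[D]$ is one that raises $E_H[C]$.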

Fig. 5 shows the minimum expected distortion $E_H[D]^*$
versus SNR for different diversity orders. With infinite diversity,
the channel power gain becomes constant at $\bar{\gamma}$, and the
distortion is given by

$$D\big|_{L=\infty} = (1 + \bar{\gamma} P)^{-b}. \tag{29}$$

In the case when there is no diversity ($L = 1$), a lower bound
to the expected distortion is also plotted. The lower bound
assumes the system has CSI at the transmitter (CSIT), which
allows the transmitter to concentrate all power at the realized
layer to achieve the expected distortion:

$$E_H[D_{\mathrm{CSIT}}] = \int_0^{\infty} e^{-\gamma} (1 + \gamma P)^{-b}\, d\gamma. \tag{30}$$

Fig. 5. Minimum expected distortion (b = 2).
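The two reference curves are simple to evaluate. The sketch below is a minimal illustration, assuming unit-mean Rayleigh fading ($\bar{\gamma} = 1$) for the CSIT bound in (30) and a plain midpoint rule truncated at a finite upper limit; the function names, truncation point and step count are ours, and the quadrature becomes less accurate at very high SNR where the integrand is sharply peaked near zero.

```python
import math

def csit_distortion(P, b, upper=50.0, steps=100000):
    """Midpoint-rule estimate of (30) for unit-mean Rayleigh fading."""
    width = upper / steps
    total = 0.0
    for k in range(steps):
        gain = (k + 0.5) * width
        total += math.exp(-gain) * (1.0 + gain * P) ** (-b)
    return total * width

def infinite_diversity_distortion(P, b, mean=1.0):
    """Distortion (29) when the channel power gain is constant at its mean."""
    return (1.0 + mean * P) ** (-b)

for snr_db in (0, 10, 20):
    P = 10.0 ** (snr_db / 10.0)
    print(snr_db, csit_distortion(P, 2.0), infinite_diversity_distortion(P, 2.0))
```

Because $(1 + \gamma P)^{-b}$ is convex in $\gamma$, the CSIT value with $L = 1$ always sits above the infinite-diversity value, and the gap widens with SNR.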

Note that at high SNR, the performance benefit from diversity
exceeds that from CSIT, especially when the bandwidth ratio
$b$ is large. In particular, in terms of the distortion exponent
$\Delta$ [9], it is shown in [11] that in a MISO or SIMO channel,
layered broadcast coding achieves:

$$\Delta \triangleq -\lim_{P \to \infty} \frac{\log E_H[D]}{\log P} = \min(b, L), \tag{31}$$

where $L$ is the total diversity order from independent fading
blocks and antennas. Moreover, the layered broadcast coding
distortion exponent is shown to be optimal and CSIT does not
improve $\Delta$, whereas diversity increases $\Delta$ up to a maximum
as limited by the bandwidth ratio $b$.

VI. Conclusion 

We considered the problem of source-channel coding over 
a delay-limited fading channel without CSI at the transmitter, 
and derived the optimal power distribution that minimizes the 
end-to-end expected distortion in the layered broadcast coding 
transmission scheme with successive refinement. In the case 
when the channel undergoes Rayleigh fading with diversity 
order L, the optimal power distribution is congregated around 
the middle layers, and within this range the lower layers are 
assigned more power than the higher ones. As SNR increases, 
the power distribution of the higher layers remains unchanged, 
and the extra power is allocated to the idle lower layers. 
Furthermore, increasing the diversity L concentrates the power
distribution towards the expected channel power gain $\bar{\gamma}$, while
a larger bandwidth ratio b spreads the power distribution 
further into the lower layers. On the other hand, in the limit as 
b tends to zero, the optimal power distribution that minimizes 
expected distortion converges to the power distribution that 
maximizes expected capacity. While the expected distortion 
can be improved by acquiring CSIT or increasing the diversity 
order, it is shown that at high SNR the performance benefit 
from diversity exceeds that from CSIT, especially when the 
bandwidth ratio b is large.